PyDigger - unearthing stuff about Python


| Name | Version | Summary | Date |
|------|---------|---------|------|
| SurvivalEVAL | 0.4.4 | The most comprehensive Python package for evaluating survival analysis models. | 2025-07-12 22:46:15 |
| rag-evaluation | 0.2.1 | A robust Python package for evaluating Retrieval-Augmented Generation (RAG) systems. | 2025-07-12 22:38:32 |
| novaeval | 0.3.2 | A comprehensive, open-source LLM evaluation framework for testing and benchmarking AI models | 2025-07-12 20:19:59 |
| RadEval | 0.0.1rc0 | All-in-one metrics for evaluating AI-generated radiology text | 2025-07-12 17:31:40 |
| pypitest-radeval | 0.0.3 | All-in-one metrics for evaluating AI-generated radiology text | 2025-07-11 14:54:29 |
| langsmith | 0.4.5 | Client library to connect to the LangSmith LLM Tracing and Evaluation Platform. | 2025-07-10 22:08:04 |
| AgentDS-Bench | 1.2.2 | Python client for AgentDS-Bench: a streamlined benchmarking platform for evaluating AI agent capabilities in data science tasks | 2025-07-09 21:21:17 |
| agenta | 0.49.3 | The SDK for agenta, an open-source LLMOps platform. | 2025-07-09 13:29:26 |
| open-rag-eval | 0.2.0 | A Python package for RAG evaluation | 2025-07-08 17:20:26 |
| benchwise | 0.1.0a1 | The GitHub of LLM Evaluation - Python SDK | 2025-07-08 10:16:01 |
| guidellm | 0.2.1 | Guidance platform for deploying and managing large language models. | 2025-04-29 17:49:39 |
| evo | 1.31.1 | Python package for the evaluation of odometry and SLAM | 2025-03-20 15:37:42 |
| ragmetrics-client | 0.1.9 | Monitor your LLM calls. Test your LLM app. | 2025-03-14 23:05:52 |
| math-verify | 0.7.0 | HuggingFace library for verifying mathematical answers | 2025-02-27 16:21:04 |
| trajectopy | 2.4.2 | Trajectory evaluation in Python | 2025-02-26 08:34:59 |
| quotientai | 0.1.9 | CLI for evaluating large language models with Quotient | 2025-02-25 18:40:21 |
| python-lilypad | 0.0.23 | An open-source prompt engineering framework. | 2025-02-25 03:25:39 |
| providentia | 2.4.0 | Providentia is designed to allow on-the-fly, offline and interactive analysis of experiment outputs, with respect to processed observational data. | 2025-02-12 13:36:50 |
| maihem | 1.7.3 | LLM evaluations and synthetic data generation with the MAIHEM models | 2025-02-11 16:54:39 |
| trust_eval | 0.1.5 | Metric to measure RAG responses with inline citations | 2025-02-11 04:42:29 |
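Each row above corresponds to metadata that PyPI publishes for a release: the package name, latest version, summary, and upload time. As a rough illustration only (not PyDigger's actual pipeline), the sketch below fetches those same fields from the public PyPI JSON API at `https://pypi.org/pypi/<name>/json`; the package names in the example run are simply taken from the table.

```python
import json
import urllib.request


def pypi_metadata(package: str) -> dict:
    """Fetch name, version, summary and latest upload time for one package
    from the public PyPI JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)

    info = data["info"]
    # "urls" lists the files of the latest release; each entry carries an
    # ISO 8601 upload timestamp, so the max is the most recent upload.
    uploads = [
        f["upload_time_iso_8601"]
        for f in data.get("urls", [])
        if "upload_time_iso_8601" in f
    ]
    return {
        "name": info["name"],
        "version": info["version"],
        "summary": info["summary"],
        "date": max(uploads) if uploads else None,
    }


if __name__ == "__main__":
    # Example packages drawn from the listing above.
    for pkg in ("langsmith", "math-verify", "evo"):
        print(pypi_metadata(pkg))
```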